On the 3D point clouds–palm and coconut trees data set extraction and their usages

Objective: Drone image data sets can be used for field surveying and image data collection, which in turn support analytics. With current drone mapping software, useful 3D object reconstruction is possible. This research aims to learn the 3D data set construction process for trees with open-source software, along with the usage of the resulting data. We therefore survey the tools used for 3D data set construction, especially in agriculture. Given the growing open-source community, we demonstrate a case study of our palm and coconut data sets against the open-source ones.

Results: The methodology for obtaining the point cloud data set was based on three tools: OpenDroneMap, CloudCompare, and Open3D. As a result, 40 palm tree and 40 coconut tree point clouds were extracted. Examples of their usage are provided in the areas of volume estimation and graph analytics.


Introduction
In agriculture, field surveying using drones is a common method to collect data. Drone images are used to analyze plant growth and crop yield. The collected images are stitched into 2D orthomosaic images. By combining other drone data with the georeference points, more information, such as height, can be obtained. This information can be used to construct 3D field models.
Constructing a 3D data set requires effort and involves many software tools. In agriculture, after the drone flight, software is needed to perform orthomosaic and 3D construction. The available options are either commercial or non-commercial. Previous work in [1] compares orthomosaic and photogrammetry software. Most of the tools mentioned in that article are commercial, such as DroneDeploy [2], Pix4D Mapper [3], Autodesk ReCap [4], 3DF [5], and Agisoft PhotoScan [6], while the open-source one is OpenDroneMap (ODM) [7].
For example, DroneDeploy is a platform with both enterprise and individual licenses available [2]. As of 2023, plans start at 329 USD per month, allowing for up to 3K images per map. This includes services such as orthophoto, plant health, and GCP. Pix4D Mapper focuses on photogrammetry tasks. It creates 3D maps from 2D maps by constructing surfaces, volumes, and point clouds. As of 2023, the minimum monthly subscription for Pix4D Mapper is 291 USD, with a floating license available for 4,900 USD [3]. Agisoft PhotoScan is another tool that focuses on photogrammetry, and it includes a feature for detecting powerlines. Its three pricing models are node-locked license, floating license, and educational license [6]. The basic edition of Agisoft PhotoScan offers features such as photogrammetric triangulation, dense point cloud generation and editing, 3D model generation and texturing, and diffuse, occlusion, and normal texture map generation. The software features are excellent, but the pricing may be unaffordable for beginners. The open-source option is therefore one solution, as it can be deployed at no cost and further customized to specific needs.
In [8], six free drone mapping software packages are mentioned. Among these, DJI GS Pro and Pix4Dcapture provide a flight planning feature. SkyeBrowse and DroneDeploy offer a trial limited to a number of days. OpenDroneMap [7], whose source code on GitHub has more than 2K stars, is the target of our research.
Our research aims to study the construction of 3D point clouds and the extraction of their features using open-source drone mapping software. Once the 3D data set is constructed, many analytics applications can be built on it. WebODM [9] is the main tool selected for orthoimage and 3D point cloud construction. The data set collected from palm and coconut field surveys in Thailand is the case study.

Data source and overall data processing
Open-source tools were applied for all pipeline steps, as shown in Fig. 1:
1. WebODM [9], which is based on OpenDroneMap (ODM) [10] and has a scheduler for processing various image processing tasks, was used to construct an orthomosaic image and 3D point clouds.
2. CloudCompare [11] was used to extract each tree from the large 3D reconstruction produced in step 1.
3. Preprocessing such as outlier removal was done for each tree using CloudCompare and Open3D statistical outlier removal [12].
4. WebODM was used to record the necessary annotations, such as tree height and volume.
5. The 3D point clouds of each tree were used to create graph data based on voxels using Open3D with the K-Nearest Neighbour algorithm [13].
6. Finally, the NetworkX library [14] was used for graph construction and property extraction.
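The statistical outlier removal mentioned in step 3 can be illustrated with a minimal, library-free sketch. It is not the Open3D implementation used in this work; it only mirrors the general idea of discarding points whose mean distance to their nearest neighbours is unusually large. The function name and the parameter values `nb_neighbors` and `std_ratio` are assumptions chosen for illustration.

```python
import math
import statistics


def remove_statistical_outliers(points, nb_neighbors=3, std_ratio=2.0):
    """Keep points whose mean nearest-neighbour distance is typical.

    points: list of (x, y, z) tuples.
    nb_neighbors, std_ratio: illustrative values, not taken from the paper.
    """
    mean_dists = []
    for i, p in enumerate(points):
        # Brute-force distances to every other point (fine for small clouds).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_dists.append(sum(dists[:nb_neighbors]) / nb_neighbors)

    # A point is an outlier if its mean distance exceeds
    # (global mean + std_ratio * global standard deviation).
    threshold = statistics.mean(mean_dists) + std_ratio * statistics.pstdev(mean_dists)
    return [p for p, d in zip(points, mean_dists) if d <= threshold]


# Hypothetical data: a small regular grid plus one stray point far away.
grid = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
cleaned = remove_statistical_outliers(grid + [(50, 50, 50)])
print(len(cleaned))
```

The brute-force neighbour search is quadratic in the number of points, so for full tree point clouds a spatial index such as the one inside Open3D would be the practical choice.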

3D Feature extractions
After extracting each tree, the main stage is to extract its features which are useful for analytic model creation.
The Open3D library was used to extract point cloud properties for each tree. The point cloud is visualized in a Jupyter Notebook, and the library extracts properties including the number of points, volume, point distance, number of mesh elements, etc.
For each cropped tree, ground-truth values such as volume and height were collected from WebODM as target labels. Together, the results from the two tools enable the construction of inference models.
For example, one can look for the relationship between the actual tree height and the tree height derived from the 3D point cloud geometry. The tree height can be extracted directly from the point cloud data. WebODM provides a measurement tool that can measure the height of a tree in 3D space. In step 2 of Section 2.1, the heights of the selected trees were measured in meters.

Fig. 1 Processing methods

Next, the bounding box of the corresponding tree in pixels was collected as an input feature using Open3D. OrientedBoundingBox in Open3D [12] was used, and the bounding box coordinates were recorded for each tree.
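As a rough illustration of the geometric features described above, the sketch below derives a tree height and a bounding box volume directly from (x, y, z) coordinates. It uses an axis-aligned box as a simpler stand-in for the oriented bounding box used in this work, and it assumes that z is the vertical axis. The function names and the sample coordinates are hypothetical.

```python
def tree_height(points):
    """Height as the vertical extent of the cloud (assumes z is 'up')."""
    zs = [p[2] for p in points]
    return max(zs) - min(zs)


def aabb_volume(points):
    """Volume of the axis-aligned bounding box enclosing the points."""
    volume = 1.0
    for axis in range(3):
        values = [p[axis] for p in points]
        volume *= max(values) - min(values)
    return volume


# Hypothetical cropped tree with three points.
tree = [(0.0, 0.0, 0.0), (2.0, 3.0, 0.0), (1.0, 1.0, 5.0)]
print(tree_height(tree))  # 5.0
print(aabb_volume(tree))  # 30.0
```

An oriented box is generally tighter than an axis-aligned one for a leaning or rotated crown, which is one reason to record both volumes as separate features.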

Graph features
After Step 3 of Section 2.1, the derived point clouds were exported as (x, y, z) coordinates. To construct a graph, the K-nearest neighbor algorithm computes the neighbor coordinates and derives the edges and their distances. A large number of voxels may result in a large graph, leading to high computation time. The voxels were therefore downsampled to reduce the computation. The downsampling ratio used was 0.4, and the neighbor threshold was limited to 100. These values can be adjusted depending on the available memory. Next, the NetworkX library was used to extract graph features [14]. The features include the number of nodes, edges, triangles, cliques, clustering, connected components, etc.
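The graph construction described above can be sketched without external libraries as follows. This is an illustration under stated assumptions, not the Open3D and NetworkX pipeline used in this work: points are first merged per voxel, each remaining point is linked to its k nearest neighbours, and connected components are counted. The voxel size, the value of `k`, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Replace all points that fall into the same voxel by their centroid."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return [tuple(sum(c) / len(group) for c in zip(*group))
            for group in buckets.values()]


def knn_edges(points, k):
    """Undirected edges {(i, j): distance} to each point's k nearest neighbours."""
    edges = {}
    for i, p in enumerate(points):
        nearest = sorted((math.dist(p, q), j)
                         for j, q in enumerate(points) if j != i)[:k]
        for d, j in nearest:
            edges[(min(i, j), max(i, j))] = d
    return edges


def count_components(n_nodes, edges):
    """Number of connected components, computed with union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(x) for x in range(n_nodes)})


# Hypothetical cloud: two well-separated groups of points.
cloud = [(0.0, 0.0, 0.0), (0.1, 0.1, 0.1), (1.0, 0.0, 0.0),
         (10.0, 10.0, 10.0), (10.5, 10.0, 10.0)]
nodes = voxel_downsample(cloud, voxel_size=0.5)
edges = knn_edges(nodes, k=1)
print(len(nodes), len(edges), count_components(len(nodes), edges))
```

With the edge dictionary in hand, attributes such as the number of triangles or cliques would normally be delegated to a graph library rather than reimplemented.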

Results
A total of 40 palm tree and coconut tree point clouds were extracted. For each tree, 12 attributes were collected in Tables 1, 2. In the tables, rows "abb_vol" and "obb_vol" correspond to the volumes of the axis-aligned bounding box and the oriented bounding box, respectively. "avg_distance" is the average distance from the nearest neighbors. "bpa_mesh", "convex_hull", and "poison" are different kinds of mesh algorithms. Each of them yields a different number of points and triangles, shown in the corresponding rows. The statistical features of the point clouds are also shown in Tables 1, 2, respectively. They present the standard deviation, mean, min, max, and 25%-75% quartiles. Fig. 2a visualizes the point cloud comparison between two palm trees (green points and blue points). There are some differences between the width and height of the two trees, as shown in Fig. 2b. Fig. 3a visualizes the point cloud differences between two coconut trees (green points and blue points), and Fig. 3b presents the box plot of the difference values. There are more differences than in Fig. 2a.
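The per-attribute summary reported in the tables (standard deviation, mean, min, max, and quartiles) can be reproduced for any attribute column with a few lines of standard-library Python. The sketch below is illustrative only: the function name and the sample values are hypothetical, and the use of the sample standard deviation and of inclusive quartiles is an assumption, since the text does not state which conventions were applied.

```python
import statistics


def describe(values):
    """Summary statistics for one attribute across all trees."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (assumed)
        "min": min(values),
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "max": max(values),
    }


# Hypothetical attribute values for five trees (not data from this study).
summary = describe([1.0, 2.0, 3.0, 4.0, 5.0])
print(summary)
```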
Tables 3, 4 present statistical data for 22 graph attributes derived from our methods. The selected graph attributes are related to nodes, edges, and subgraph structures. For instance, "max_ind_set" is the size of the maximum independent set, "max_matching" is the subset of edges in which no node occurs more than once, and "num_clique" is the number of cliques; "vertex_cover" is also reported. Figs. 4a and 5a visualize the graph attributes of twenty palm trees and coconut trees, respectively. Figs. 4b and 5b visualize the three types of distances of the two data sets.

Discussion
For the 3D point clouds, the coconut trees have more points than the palm trees, while their volume is smaller. The standard deviation of the coconut data set is also larger, implying that the data may not be cleaned enough. One reason is that the coconut tree is more difficult to crop, since the shape of the tree top varies considerably. The ground point clouds attached to each tree during the cropping process overlap those of the tree, which induces more outliers than for the palm trees. Therefore, the coconut's mesh size is larger than that of a palm tree. Nevertheless, the derived volume and density can be used to estimate the tree size and richness after normalization.
When properly cleaned, the derived properties can be used to build a machine-learning model that estimates the crop size. To expand the usage, an algorithm that segments each tree point cloud automatically can be derived, and the volume estimation can then be performed for each tree. This will reduce manual measurement and increase the effectiveness of inspecting the crop size.
Comparing the two graph data sets, although the same parameter settings were used to produce the values, the palm tree point clouds seem to be larger than those of the coconut trees. For the palm trees, there are some large trees, for example p30, as indicated by the large number of nodes and connected components (Fig. 4a). For the coconut trees, a few have about the same size, such as rows 2-11, row 2-20, rows 2-27, etc., as in Fig. 5a.
The values of all distances are close to each other for the coconut data set, while the differences between g_distance, Weis_distance, and greedy_distance are larger for the palm data set. For the derived data sets and graph attributes, we demonstrated classification and clustering results considering two classes: coconut and palm. Fig. 6 presents the score of each classification method. All approaches can distinguish coconut from palm trees.
In addition, we combined both data sets and applied clustering algorithms to the combined data. The purpose is to demonstrate the similarity of the two classes. Fig. 7 compares the results from two clustering approaches, K Means and Birch [15]. Two attributes, 'g_eff' and '# clique', are shown in the scatter plot. Compared to the original clustering in Fig. 7a, K Means performs slightly better. Fig. 8 shows the common metrics of the clustering results for five methods.
With these numeric attributes, other analytic opportunities are as follows.
1. The model to compute the size of the tree can be estimated by using these attributes.
2. Some attributes may infer the tree density, such as strongly connected components, average neighbor degrees, number of triangles, etc.
3. The inexact subgraph matching [16] can be applied to segment parts of the tree.
4. Graph neural network [17] can be applied to find the model to identify the substructure of the tree. The substructure may imply certain characteristics of the plant.

Comparison to other works
Several works have addressed drone data sets. Most of them cover urban survey areas. Drone mapper resources (https://dronemapper.com/sample_data/) provide urban survey images from many places, such as Colorado and Switzerland. In agriculture, most published research uses data sets from orthomosaic images to perform analytics such as crop yield and 3D point cloud biomass estimation. The whole orthomosaic image was used to calculate the yield indices and biomass. Commercial tools such as DroneDeploy, Pix4D, and Agisoft were used for preprocessing. Tunrayo et al. [18] considered soybean grain yield prediction. Pix4D was used to create the orthomosaic for vegetation indices, and machine learning models were used for yield prediction. Acorsi et al. [19] considered black oat crops with UAV images. DroneDeploy was used to perform the orthomosaic process, and Agisoft PhotoScan was used for the photogrammetry. They performed biomass estimation on the derived photogrammetry. Table 5 compares previous works that take advantage of point cloud data sets in various ways. The most common data source for point cloud construction is 3D cameras. In contrast, our work uses 3D point clouds constructed from SFM (as in [21]), built with WebODM, and provides different applicability through graph features.

Limitation
The data set was first derived using WebODM. The point cloud for the whole field contains many trees of various sizes. Because the whole point cloud is large, a computer with powerful resources is needed to extract the trees manually. Moving in 3D space with CloudCompare can be slow if the computer has less than 16 GB of memory. The alternative is to partition the whole point cloud into smaller ones and work on each partition.
This work focuses on individual trees. Future work includes the design of an algorithm to automate the analysis, e.g., estimating the crop size for the whole field. The graph for the whole field must be generated, and subgraph segmentation using various methods can then be applied [25][26][27].
Our initial data set for the study was collected from a drone survey in 2022. The areas of the palm and coconut fields are 345,686.94 m² and 224,573 m², respectively, in Pathum Thani province, Thailand.

Fig. 2 (a) Palm tree point clouds; (b) box plot comparison

Fig. 4 (a) Palm tree graph attributes; (b) distance comparison

Fig. 5 (a) Coconut graph attributes; (b) distance comparison

Fig. 7 Comparing clustering approaches. (a) presents the original clusters, (b) shows the clusters obtained by K Means, and (c) shows the clusters obtained by Birch

Fig. 8 Scoring of several clusters

Table 1
Statistics for palm point clouds

Table 2
Statistics for coconut point clouds

Table 3
Statistics for graph data of palm point clouds

Table 4
Statistics for graph data of coconut point clouds

Table 5
Previous works that utilize tree point clouds